Prior choice and data requirements of Bayesian multivariate hierarchical models fit to tag‐recovery data: The need for power analyses

Abstract Recent empirical studies have quantified correlation between survival and recovery by estimating these parameters as correlated random effects with hierarchical Bayesian multivariate models fit to tag-recovery data. In these applications, increasingly negative correlation between survival and recovery has been interpreted as evidence for increasingly additive harvest mortality. The power of these hierarchical models to detect nonzero correlations has rarely been evaluated, and the few studies that have done so did not focus on tag-recovery data, which is a common data type. We assessed the power of multivariate hierarchical models to detect negative correlation between annual survival and recovery. Using three priors for multivariate normal distributions, we fit hierarchical effects models to a mallard (Anas platyrhynchos) tag-recovery data set and to simulated data with sample sizes corresponding to different levels of monitoring intensity. We also demonstrate summary statistics for tag-recovery data sets that are more robust than total individuals tagged. Different priors led to substantially different estimates of correlation from the mallard data. Our power analysis of simulated data indicated most prior distribution and sample size combinations could not estimate strongly negative correlation with useful precision or accuracy. Many correlation estimates spanned the available parameter space (−1, 1) and underestimated the magnitude of negative correlation. Only one prior combined with our most intensive monitoring scenario provided reliable results. Underestimating the magnitude of correlation coincided with overestimating the variability of annual survival, but not annual recovery. The inadequacy of prior distributions and sample size combinations previously assumed adequate for obtaining robust inference from tag-recovery data represents a concern in the application of Bayesian hierarchical models to tag-recovery data. Our analysis approach provides a means for examining the influence of priors and sample size on hierarchical models fit to capture-recapture data while emphasizing transferability of results between empirical and simulation studies.


| INTRODUCTION
Modern approaches to quantifying relationships among demographic parameters include modeling temporal variation in vital rates as correlated random effects drawn from multivariate normal distributions (Fay et al., 2021; Link & Barker, 2005; Riecke et al., 2019). This approach allows the correlation between parameters to be estimated without bias from sampling covariation between parameters (Otis & White, 2004). Bayesian modeling frameworks offer a more tractable approach to fitting relatively complex random effects structures than frequentist approaches, so Bayesian estimation is a natural choice when fitting multivariate hierarchical models (Royle & Link, 2002). Although Bayesian estimation is advantageous when fitting complex random effects structures, posterior distributions are also influenced by the prior distributions (Gelman et al., 2014).
In both conservation and management settings, demographic observations are often obtained through capture-recapture methods (Williams et al., 2002). Multivariate hierarchical models specific to capture-recapture data have estimated correlations between survival and recruitment (Link & Barker, 2005), juvenile and adult survival (Riecke et al., 2019), survival and reproduction (Paterson et al., 2018), reproductive effort and reproductive frequency (Badger et al., 2020), and parameters estimated with integrated population models (Schaub et al., 2013). Additionally, some studies have focused on quantifying the impact of harvest on vital rates like annual survival (or natural mortality) by estimating correlation between random effects from tag-recovery data, which is a capture-recapture data type in which individuals are reencountered after experiencing mortality (Arnold et al., 2016; Bartzen & Dufour, 2017; Koons et al., 2014; Servanty et al., 2010). Tag-recovery data are advantageous in that the harvest of tagged individuals provides information on cause-specific mortality so that both survival probability and tag-recovery probability can be estimated from this single data type (Brownie et al., 1985; Otis & White, 2004).
With capture-recapture data, statistical power depends on both total tags deployed and the proportion of those individuals reencountered after initial tagging (Williams et al., 2002). For waterfowl species in North America, it is common for <10% of tagged individuals to be recovered by hunters (Cooch et al., 2014). Thus, it is conceivable to have apparently large data sets (tens of thousands of tagged individuals) but still have limited data from which parameters can be estimated (Sheaffer & Malecki, 1995). Despite wide use of tag-recovery and other capture-recapture data types, a congruent approach to quantifying and reporting the sample sizes associated with resighting, recapturing, or recovering tagged individuals is lacking.
With tag-recovery data, total years that tagged individuals are known to be alive before being recovered (known-fate years) is a function of total animals tagged along with both the expected lifespan and recovery probability of tagged individuals. Direct recoveries do not contribute to known-fate years, as these recoveries occur immediately following initial tagging when no natural mortality is assumed to occur (Brownie et al., 1985). Even though direct recoveries are required for parameter identifiability and improve the precision of parameter estimates by increasing the number of individuals for which a fate is known, these recoveries do not offer much information about the probability of survival for one or more periods in which mortality is assumed to occur (Williams et al., 2002). In the case of waterfowl, indirect recoveries are those that occur after at least one hunting season and one year have elapsed since banding.
Available data in the form of total recoveries and known-fate years (which come from indirect recoveries) affect parameter estimation when modeling tag-recovery data, but in Bayesian models, prior distributions can also affect posterior distributions and may interact with available data to reduce the precision and accuracy of parameter estimates (Gelman et al., 2014). In the context of multivariate normal distributions, prior choice is an important consideration as recent research has shown that correlation between random effects is sensitive to the priors used (Riecke et al., 2019). In these applications, the covariance matrix of a multivariate normal distribution (Σ) is parameterized by the standard deviations of random effects (σ) and the correlation between random effects (ρ). Previous simulation research focused on modeling capture-recapture data with Cormack-Jolly-Seber models has shown the magnitude of correlation between random effects is underestimated when placing Wishart priors on the precision matrix (Σ⁻¹) used to estimate correlated random effects (Riecke et al., 2019). These authors also report that a prior formulation for the variance-covariance matrix that placed Uniform priors on the standard deviations of random effects and on the correlation parameter estimated correlation with less bias than the Wishart prior they assessed (Riecke et al., 2019).
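The relationship between the covariance matrix, the standard deviations, the correlation, and the precision matrix can be illustrated with a minimal Python sketch. This is not the authors' code; the function names and the numeric values (standard deviations of 0.4 and 0.3, correlation of −0.8) are hypothetical and chosen only for illustration.

```python
import numpy as np

def covariance_from_sd_and_correlation(sd_survival, sd_recovery, rho):
    """Build the 2 x 2 variance-covariance matrix from two SDs and a correlation."""
    off_diagonal = rho * sd_survival * sd_recovery
    return np.array([[sd_survival ** 2, off_diagonal],
                     [off_diagonal, sd_recovery ** 2]])

def correlation_from_covariance(sigma):
    """Recover the correlation from a 2 x 2 variance-covariance matrix."""
    return sigma[0, 1] / np.sqrt(sigma[0, 0] * sigma[1, 1])

# Hypothetical values, for illustration only.
sigma = covariance_from_sd_and_correlation(0.4, 0.3, -0.8)

# The precision matrix is the inverse of the covariance matrix.
precision = np.linalg.inv(sigma)

# Inverting the precision matrix returns the covariance matrix,
# from which the correlation can be recomputed.
rho_back = correlation_from_covariance(np.linalg.inv(precision))
```

Note that for a 2 × 2 matrix the off-diagonal element of the precision matrix has the opposite sign to the covariance, which is one reason a prior placed on precision components is not equivalent to the same prior placed on σ and ρ.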
Correlations between demographic parameters have long been recognized as biologically meaningful (Anderson & Burnham, 1976; Nichols & Hines, 1983), and in the context of harvest management, negative correlation between the random effects of year on survival and harvest can be interpreted as evidence for harvest mortality that is additive to natural mortality (Arnold, Afton, et al., 2017). Specifically, posterior distributions of correlation between survival and recovery have been interpreted as providing evidence for strongly additive (ρ < −0.7), moderately additive (−0.7 < ρ < −0.5), weakly additive [...] mortality (Arnold, Afton, et al., 2017). While priors and effective sample size should be expected to interact along a spectrum when calculating posterior estimates of correlation between random effects (Gelman et al., 2014), guidelines like those suggested by Arnold, Afton, et al. (2017) have been recommended without assessing whether the data have enough power to robustly support these interpretations. We suggest an understanding of these interactions may be of key importance when interpreting correlation parameters for the purpose of informing harvest management.

KEYWORDS
Bayesian analysis, capture-recapture, harvest assessment, hierarchical models, multivariate normal distribution, random effects

TAXONOMY CLASSIFICATION
Demography, Population ecology
Here, we assess the power of multivariate hierarchical models fit with Bayesian estimation to detect negative correlations between annual survival and recovery when these parameters are estimated as random effects. For our case study, we use banding and recovery data from the midcontinent mallard (Anas platyrhynchos) population, one of the largest tag-recovery data sets in the world, for the years 1961-1996. We focus on tag-recovery data and the widely implemented model of Brownie et al. (1985) (White & Burnham, 1999). We also summarize known-fate years for both empirical and simulated data sets while quantifying the power of our models to detect strongly negative correlation with respect to different monitoring scenarios and the three prior formulations we used to initialize Bayesian models fit to the mallard data (Johnson et al., 2015). Our case study focuses on female mallards, as we show unexpected data limitations associated with the female data when compared to the male data.

| Mallard banding data
We began our case study using data from mallards banded in the Central and Mississippi flyways (hereafter midcontinent mallards) as defined by the U.S. Fish and Wildlife Service [USFWS] (2017) from 1961-1996; these years were chosen because reporting probability of harvested mallards wearing metal leg bands varied little and ranged from 0.3 to 0.4 during these years (Arnold et al., 2020).
The mid-1990s also coincides with the implementation of Adaptive Harvest Management for mallards, which led to a change in the decision-making process used to recommend harvest regulations for this population (USFWS, 2020). We acquired banding records for mallards marked with regular metal bands between 1 June and 30 September from the USGS Bird Banding Lab (BBL; Laurel, MD, USA).
For these mallards, we also obtained recovery records for those individuals that were recovered by hunters (BBL code HOW = 1) between 1 September and 30 April. We excluded recoveries reported in Alaska, northern and eastern Canadian provinces (provinces NB, NL, NS, NT, NU, PE, and YT), and Mexico because recoveries from these regions were rare or band reporting probabilities are either unknown or lower than the rest of North America (Arnold et al., 2020).
With these banding and recovery data, we constructed m-arrays (M_{age}) that summarize recoveries by cohort (rows) and recovery year (columns) for each age and sex class of mallards for analysis with multinomial models (Brownie et al., 1985). For Y years of tagging and recovery data (Y = 36), each m-array has Y rows and Y + 1 columns: a square Y × Y block of recoveries plus a final column containing the total unrecovered individuals per cohort. Each m-array contains two recovery types: direct and indirect. Direct recoveries are found along the main diagonal of each m-array and are those occurring during the hunting season beginning in the same year as tagging, when no natural mortality is assumed to occur. Indirect recoveries are found to the right of the main diagonal of the m-array (and to the left of the final column) and are those recoveries occurring in hunting seasons at least one year after tagging.
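The m-array layout described above lends itself to simple summaries. The following Python sketch is illustrative only and is not the authors' code. It assumes that each indirect recovery contributes one known-fate year per year elapsed between the release year and the recovery year, and that direct recoveries contribute none; the function name and the small 3-year m-array are hypothetical.

```python
import numpy as np

def summarize_m_array(m_array):
    """Summarize an m-array with Y rows (cohorts) and Y + 1 columns.

    Columns 0..Y-1 hold recoveries by year; the last column holds the
    individuals from each cohort that were never recovered.
    """
    m = np.asarray(m_array)
    n_years = m.shape[0]
    recoveries = m[:, :n_years]              # drop the never-recovered column
    releases = int(m.sum())                  # every released bird is in exactly one cell
    direct = int(np.trace(recoveries))       # main diagonal
    indirect = int(np.triu(recoveries, k=1).sum())  # right of the diagonal
    # Years elapsed between release (row) and recovery (column).
    rows, cols = np.indices(recoveries.shape)
    lag = np.clip(cols - rows, 0, None)
    # Assumption: a recovery after `lag` years adds `lag` known-fate years.
    known_fate_years = int((recoveries * lag).sum())
    return {
        "releases": releases,
        "direct": direct,
        "indirect": indirect,
        "known_fate_years": known_fate_years,
        "known_fate_years_per_release": known_fate_years / releases,
    }

# Hypothetical 3-year m-array: 100 releases per cohort.
example = [[10, 5, 2, 83],
           [0, 12, 4, 84],
           [0, 0, 9, 91]]
summary = summarize_m_array(example)
```

In this toy example, the 11 indirect recoveries yield 13 known-fate years across 300 releases, which shows how a data set can look large by total releases while containing little information about survival.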
To complement our empirical analysis (Table 1) and power analyses (Table 2), we report conventional summary statistics for tag-recovery data in the form of total releases, total direct recoveries, and total indirect recoveries. In addition, we report a metric of available data in the form of total known-fate years, average known-fate [...]

| Correlation models
We parameterized the multinomial formulation of the Brownie et al. (1985) band-recovery model in a Bayesian framework to estimate annual survival (S) and recovery (f) probabilities for two age classes (Brownie et al., 1985; Kéry & Schaub, 2012): juvenile (i.e., hatch year; HY) and adult (i.e., after-hatch year; AHY). A common notation for this model structure would be S_{age,year} f_{age,year}; we exclude "sex" from the notation as we analyzed the female and male data separately. With this model formulation, the recovery parameter (f) is the joint probability of being shot, retrieved, and reported (Brownie et al., 1985). We note the models specified by [...] implement the Seber r parameterization in place of the Brownie f parameterization, which is a difference regarding the estimation of recovery probability (Cooch et al., 2014, pp. 246-248). When specifying the Brownie f parameterization, we did not distinguish between direct recovery probability and indirect recovery probability. We did not parameterize models with the Seber r formulation (Sedinger et al., 2010) as this method is prone to providing incorrect inference when temporal variation in natural mortality exceeds variation in harvest mortality (Figure 1; code available in supporting information online). While future work could further assess the Seber r parameterization for correlation analyses, the mathematical difficulties demonstrated in Figure 1 preclude us from further considering this parameterization here.
We estimated the probability of observed tag-recovery data M_{age,y,1:Y} using multinomial distributions with age-specific success probabilities for juveniles and adults and the total individuals released (R_{age}) during each year (Equation 1).
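To make the multinomial structure concrete, the sketch below computes cell probabilities for a single age class under the standard Brownie f parameterization, in which a bird released in year i is recovered in year j with probability S_i × ... × S_{j-1} × f_j. It is a simplified illustration for adults only and is not the authors' JAGS model; the function names and the constant rates (S = 0.6, f = 0.1) are hypothetical.

```python
import numpy as np

def adult_cell_probabilities(survival, recovery):
    """Brownie-style m-array cell probabilities for one age class.

    Returns a Y x (Y + 1) matrix; the last column is the probability
    of never being recovered during the study.
    """
    survival = np.asarray(survival, dtype=float)
    recovery = np.asarray(recovery, dtype=float)
    n_years = len(recovery)
    cells = np.zeros((n_years, n_years + 1))
    for release in range(n_years):
        alive = 1.0  # probability of surviving from release to the current season
        for year in range(release, n_years):
            cells[release, year] = alive * recovery[year]
            alive *= survival[year]
        cells[release, n_years] = 1.0 - cells[release, :n_years].sum()
    return cells

def multinomial_log_kernel(m_array, cells):
    """Sum of count * log(probability), omitting the multinomial constant."""
    m = np.asarray(m_array, dtype=float)
    observed = m > 0  # skip empty cells, including the structural zeros
    return float(np.sum(m[observed] * np.log(cells[observed])))

# Hypothetical constant rates over a 3-year study.
cells = adult_cell_probabilities([0.6, 0.6, 0.6], [0.1, 0.1, 0.1])
m_array = [[10, 5, 2, 83],
           [0, 12, 4, 84],
           [0, 0, 9, 91]]
log_kernel = multinomial_log_kernel(m_array, cells)
```

Each row of the resulting matrix sums to 1, so each cohort's row of the m-array can be treated as one multinomial draw with the number released as its index.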
We defined the age-specific cell probabilities as a function of annual survival and recovery probability. Cell probabilities correspond [...]

We estimated annual survival and recovery probabilities on the logit scale while using random effects to model annual variation in survival (S_y) and recovery (f_y) relative to the hierarchical mean survival (S) and recovery (f) probability for each age class (Equations 3, 4). Neither the hierarchical means nor the random effects were shared between age classes, so we omit the age notation [...]

TABLE 2 Summary of tag-recovery data for the 50 tag-recovery realizations corresponding to each sampling scenario we included in our power analysis.

We drew random effects from multivariate normal distributions with a mean of 0 and a variance-covariance matrix (Σ) or precision matrix (Σ⁻¹), depending on which of the three priors we specified for these matrices. Our decision to place priors on the variance-covariance matrix or the precision matrix was due to practical limitations in Program JAGS related to ensuring these matrices are positive definite (Plummer, 2003). Here, we consider three models that differ in the priors we used to initiate multivariate normal distributions.
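The logit-scale random effects structure can be sketched as follows. This is an illustrative simulation rather than the authors' model code, and it assumes the annual rates are obtained by adding a bivariate normal random effect to the logit of each hierarchical mean. The function names, the mean rates (0.57 and 0.05), and the covariance values are hypothetical.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_annual_rates(mean_survival, mean_recovery, sigma, n_years, rng):
    """Draw annual survival and recovery with correlated logit-scale effects."""
    # One (survival, recovery) pair of random effects per year, centered on 0.
    effects = rng.multivariate_normal(np.zeros(2), sigma, size=n_years)
    survival = expit(logit(mean_survival) + effects[:, 0])
    recovery = expit(logit(mean_recovery) + effects[:, 1])
    return survival, recovery, effects

# Hypothetical covariance matrix: SDs of 0.4 and 0.3, correlation of -0.8.
sigma = np.array([[0.4 ** 2, -0.8 * 0.4 * 0.3],
                  [-0.8 * 0.4 * 0.3, 0.3 ** 2]])
rng = np.random.default_rng(1)
survival, recovery, effects = simulate_annual_rates(0.57, 0.05, sigma, 36, rng)
```

With only 36 years, the sample correlation of the simulated effects can differ noticeably from the value used to generate them, which is part of why recovering a correlation from a single study period is demanding.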
The first prior was a Wishart distribution (Link & Barker, 2005; [...] Our prior for correlation is vague, like a single Uniform distribution prior, but the use of a Beta distribution would allow for a shaped prior if desired. When implementing these flat priors, random effects for survival and recovery were drawn from multivariate normal distributions with a mean of 0 and a variance-covariance matrix using the JAGS function dmnorm.vcov (Plummer, 2003).
(3) Our third prior formulation placed priors on components of the precision matrix, which is like the default priors specified for multivariate normal distributions in Program MARK (White & Burnham, 1999). In this case, we specified the distribution Gamma(1.001, 0.001) for the parameters found along the main diagonal of the precision matrix (1/σ²) and a flat prior for the equivalent of a correlation parameter (ρ*) for the precision matrix (Equation 7). With this prior, random effects were estimated from multivariate normal distributions with mean values of 0 and a precision matrix. Here, we calculated correlation from the variance-covariance matrix after inverting the precision matrix in Program JAGS (Plummer, 2003).
For clarity, we consistently present results in order of juvenile females, adult females, juvenile males, and adult males.

| Power analysis
After fitting our models to the mallard data, we assessed our power to detect negative correlation between survival and recovery with the models we used for the empirical analysis (Figure 2); our approach was guided by a simple question, "will my study answer my research question?" (Johnson et al., 2015). We began by simulating age-specific survival and recovery probabilities for a 36-year period using multivariate normal distributions and parameters we estimated for female mallards with Uniform priors while also specifying correlation between survival and recovery to be −0.[...] mortality when band-reporting probability is 0.4 (Appendix S1), which is close to the estimated band-reporting probability during the years of our study (Arnold et al., 2020). With the annual survival and recovery probabilities that we simulated for each class (Figures 3b, 4b), we simulated known-fate histories for an entire population of individuals (CH_{known}). From these known-fate capture histories, we simulated observed capture histories (CH_{observed}) only containing initial encounters and recoveries [...] and (4) intensive (HY_{10,000}, AHY_{10,000}). The modest scenarios are comparable to sample sizes available for female lesser scaup (Aythya affinis; Arnold et al., 2016), the intermediate scenario is like sample sizes available for northern pintails (Anas acuta; Bartzen & Dufour, 2017), and the intensive scenario is more like the sample sizes available for mallards.
We then randomly sampled capture histories from CH_{observed} using the criteria of each monitoring scenario to obtain 200 realized data sets, 50 for each monitoring scenario (Figure 2f). For data set i from monitoring scenario ms, we summarized the sampled capture histories to m-array format (M_{i,ms,age}). We sampled without replacement within each data set and with replacement among data sets.
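The sampling step can be sketched as below. This is an illustration under simplifying assumptions, not the authors' procedure: the simulated population is represented by two hypothetical integer arrays (release year and recovery year, with −1 meaning never recovered), and a fixed number of tags is drawn per release year. The function name and the toy population are invented for the example.

```python
import numpy as np

def sample_m_array(release_year, recovery_year, tags_per_year, n_years, rng):
    """Sample tagged individuals per cohort and summarize them as an m-array."""
    release_year = np.asarray(release_year)
    recovery_year = np.asarray(recovery_year)
    m_array = np.zeros((n_years, n_years + 1), dtype=int)
    for year in range(n_years):
        cohort = np.flatnonzero(release_year == year)
        # Without replacement: an individual appears at most once per data set.
        chosen = rng.choice(cohort, size=tags_per_year, replace=False)
        recovered = recovery_year[chosen]
        hits = recovered[recovered >= 0]          # -1 marks never recovered
        m_array[year, :n_years] = np.bincount(hits, minlength=n_years)
        m_array[year, n_years] = tags_per_year - hits.size
    return m_array

# Toy population: 10 individuals released per year for 3 years; every other
# individual is recovered in its release year, the rest are never recovered.
release_year = np.repeat(np.arange(3), 10)
recovery_year = np.where(np.arange(30) % 2 == 0, release_year, -1)

rng = np.random.default_rng(7)
# Separate calls draw from the same population, so individuals can recur
# among data sets (sampling with replacement among data sets).
realizations = [sample_m_array(release_year, recovery_year, 4, 3, rng)
                for _ in range(5)]
```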
We then fit the same three Brownie models to each data set M_{i,ms,age} (with Wishart, Gamma, and Uniform priors) to estimate the same parameters as for the mallard data (Figure 2g), including correlation between survival and recovery (ρ̂_{i,ms,age}). We evaluated our power to detect strongly negative correlation by comparing median (50th quantile) estimates of ρ̂_{i,ms,age} to ρ_{R,age}. We also calculated the proportion of the posterior estimates of correlation that fell within the bins that associate values of correlation to varying degrees of additive mortality (Figure 2h) suggested by Arnold, Afton, et al. (2017).
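The binning of posterior draws can be illustrated with a short Python sketch. It is not the authors' code. The cut points −0.7 and −0.5 come from the thresholds quoted in the Introduction, and −0.3 and 0.3 from the range reported in our results; treating these four values as the complete set of bin edges is an assumption, so the bins are left unlabeled. The function name and the six example draws are hypothetical.

```python
import numpy as np

def posterior_bin_proportions(rho_draws, edges=(-0.7, -0.5, -0.3, 0.3)):
    """Proportion of posterior draws of correlation falling in each bin.

    Bins run from -1 to 1 and are split at `edges` (an assumed set of
    cut points). The first bin corresponds to rho < -0.7.
    """
    draws = np.asarray(rho_draws, dtype=float)
    bounds = np.concatenate(([-1.0], edges, [1.0]))
    counts, _ = np.histogram(draws, bins=bounds)
    return counts / draws.size

# Hypothetical posterior draws, for illustration only.
draws = [-0.9, -0.8, -0.6, -0.4, 0.0, 0.5]
proportions = posterior_bin_proportions(draws)
posterior_median = float(np.median(draws))
```

Reporting the full vector of proportions, rather than only the bin containing the median, makes it visible when a posterior spreads its mass across several mutually exclusive interpretations.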
Our power analysis focused on the combination of sampling and parameter estimation to capture the truth of a population, which in our case is ρ_{R,age}. We did not focus on the ability of a model to estimate parameters from a sample (Riecke et al., 2019). Both approaches have merits depending on intention, and our intention was to focus on our ability to detect strongly negative correlation; this goal is best served by the approach we implemented here.

| Mallard analysis
Based on age at banding, we obtained records for 322,257 hatch-year females, 310,295 after-hatch year females, 375,574 hatch-year males, and 584,851 after-hatch year males (Table 1).
On a per-individual basis, there were twice as many known-fate years for every male (≈0.2) as there were for every female (≈0.1), but expected known-fate years per individual did not meaningfully vary within sex by release age (Table 1). [...] 4) as well as posterior distributions of correlation between random effects were updated by the data such that the estimates did not span the parameter space (Figure 5). Survival was generally estimated to be between 0.5 and 0.7, which is near the middle of the logit parameter space, while recovery was generally <0.1, which is near the lower bound of the logit parameter space (Table 3).

If correlation between the annual random effects for survival and recovery is interpreted using previously suggested [...]

Inference about the variability of mean and annual mallard survival also depended on prior choice (Table 3). Annual survival estimates were most variable when we fit models using Wishart priors, while models fit using Uniform priors were slightly more variable than models fit using Gamma priors (Figures 3, 4, 6). This variability was evident by (1) greater Bayesian standard deviations for mean survival estimates ([...] (3) less precisely estimated random effects (Figure 6). Unlike survival, the variability of mean recovery and annual recovery estimates was generally insensitive to prior choice (Figure 6; Table 3, columns 4-6).

| Power analysis
Our power analysis indicated data from intensive monitoring and models fit with Uniform priors recovered our reference values of correlation between survival and recovery (ρ_{R,HY} = −0.801, ρ_{R,AHY} = −0.787) with more reliability than any other combination of monitoring scenario and prior choice (Table 4, Appendix S2). With modest data, the opposite was true, as we had little, if any, power to detect negative correlation between survival and recovery (Table 4, Appendix S2). Correlation was more likely than mean survival to be estimated with severe bias and with higher sensitivity to both effective sample size and prior choice (Figure 7). Prior influence extended beyond estimates of correlation to estimates of annual survival, but not annual recovery, with annual survival being most variable when estimated using Wishart priors and similarly less variable when estimated with Uniform or Gamma priors (Figure 8). Below, we summarize our results by the intensive, intermediate, and modest monitoring scenarios.

| Intensive monitoring scenario
The median estimate of correlation from all data realizations corresponding to our intensive monitoring scenario (10,000 tags per age class annually) was close to our reference values for correlation when we used Uniform priors (ρ̂_{HY} = −0.841, ρ̂_{AHY} = −0.753). With Gamma priors, correlation estimates were more negative than when we fit models using Uniform priors (ρ̂_{HY} = −0.928, ρ̂_{AHY} = −0.812), with results for juveniles being more sensitive and more negative than results for adults (Table 4). With intensive monitoring and Uniform priors, about 83% and 64% of the posterior estimates of correlation were in the range ρ < −0.7 for juveniles and adults, respectively. Implementing Gamma priors resulted in about 93% and 81% of [...]

TABLE 3 Survival (Ŝ) and recovery (f̂) probabilities we estimated for midcontinent mallards between 1961 and 1996.

[...] These posterior estimates appear to be well estimated, as the posterior distributions are peaked and do not span the parameter space (Appendix S2), while substantially underestimating the magnitude of negative correlation such that 56% of these estimates for juveniles and 53% of these estimates for adults fell in the range −0.3 < ρ < 0.3.

| Modest monitoring scenario
Our results did not meaningfully vary between constant and episodic monitoring scenarios in which cohort size was modest (Table 4); therefore, we only summarize the modest scenario with constant cohort sizes of 250 juveniles and 800 adults in the text while presenting results for both modest scenarios (Table 4, Appendix S2). The median estimates of age-specific correlation from models with Uniform priors fit to our modest data realizations [...] (Table 4). Unlike the correlation estimates from Uniform and Gamma priors, the combination of modest data and Wishart priors provided estimated correlation parameters that were peaked without spanning the parameter space, thereby not providing obvious evidence these data were inadequate (Appendix S2). If we were to adopt an approach of rejecting additive harvest on the criterion of the 95% credible interval overlapping 0, we would conclude harvest was compensatory 100% of the time with every modest data realization (Table 4, Appendix S2) even though correlation between survival and recovery was strongly negative.

| DISCUSSION
After using previously published methods for estimating and interpreting correlation between survival and recovery (e.g. Arnold et al., 2016), we advise against drawing strong conclusions for mallards given the inferential issues we uncovered. This is despite mallards being the most abundant duck species in North America and the waterfowl species with the most abundant tag-recovery data.
Particularly for juvenile females and adult males, the imprecision of these estimates precludes conclusions stronger than (1) correlation is more negative than not and (2) correlation is not strongly negative.
Furthermore, our results depended on prior choice such that in the absence of a power analysis or comparison of different priors, we would not have any basis for concluding one prior to be more (or less) capable of recovering true parameters than another.
It is only through our power analysis that we could conclude that using Gamma priors will likely lead to overestimating the magnitude of negative correlation and Wishart priors likely underestimated the magnitude of negative correlation between survival and recovery.

TABLE 4 Summary of age-specific correlation estimated for each monitoring scenario and prior distribution (Wishart, Uniform, Gamma) that we considered in our power analysis.
The discrepancies between results obtained with different priors and sample sizes highlight the potential for compromised or misleading inference from Bayesian analyses like ours when the model's behavior and power are not explored. Given the importance and interpretation applied to such correlation estimates in management and conservation contexts, we cannot overstate the potential for data limitations and prior choice, seemingly idiosyncratic modeling issues, to result in misleading inference that potentially leads to misguided harvest management recommendations.
With Wishart priors, we would infer female survival is more responsive to environmental conditions, and thereby less sensitive to harvest, when compared to results obtained using Gamma priors or Uniform priors ( Figure 6). If annual survival is more variable and less precisely estimated with Wishart priors while annual recovery is insensitive to prior choice ( Figure 6), then underestimating the magnitude of correlation with Wishart priors is an expected outcome relative to Uniform and Gamma priors. Reduced sensitivity of recovery estimates to prior choice is not entirely a surprise given the proximity of recovery estimates to the more-precisely estimated boundary of the parameter space (Gelman et al., 2014) and the direct link between the data (recoveries) and recovery probability.
Overestimating the variability of survival and underestimating correlation with Wishart priors is consistent with results from Fay et al. (2021); these authors found heterogeneity among individuals was overestimated and correlation between traits underestimated with multivariate hierarchical models fit to capture-recapture data.
The sensitivity of annual survival estimates to prior choice has implications beyond correlation analyses like ours to applications like sensitivity and elasticity analyses of vital rates with Bayesian integrated population models (Arnold, Clark, et al., 2017; Koons et al., 2017). Our results indicate that the contribution of survival to population growth could be over- or underestimated if the priors for tag-recovery models within integrated population models led to over- or underestimating the variability of annual survival.

FIGURE 7 Median estimates of correlation displayed by each prior distribution and monitoring scenario combination; Wishart (blue), Uniform (gray), and Gamma (yellow). Fifty estimates are displayed for each combination of age, monitoring scenario, and prior distribution. Points are horizontally jittered by the difference between the hierarchical mean survival estimate for each age class and the age-specific median survival probabilities of our simulated population (S_{HY} = 0.576 and S_{AHY} = 0.572). For reference, the mean survival estimates for juveniles in sampling scenario 1 estimated with Wishart priors range from 0.525 to 0.646, and these points are jittered relative to the value of 0.576. The horizontal lines (dashed) correspond to true correlation between survival and recovery (ρ_{R,HY} = −0.801, ρ_{R,AHY} = −0.787).
Our results also demonstrate that biological plausibility of parameters like mean survival and recovery (Table 3) does not ensure reliability of all the parameters estimated from a model, such as correlation between survival and recovery. Especially with our modest data realizations, mean survival estimates were biologically reasonable when corresponding estimates of correlation severely underestimated negative correlation between survival and recovery (Figure 7).
The tendency of correlation estimates to span the parameter space when using models fit with Uniform or Gamma priors indicates data are insufficient to estimate random effects that do not overlap 0. This is problematic because similar sample sizes have been believed sufficient in several published analyses, but it appears reliable inference is not achievable with those modest sample sizes. Moreover, we are usually restricted to a single data set from which to draw inference and not the 50 realizations, such that the inadequacy of modest data for obtaining inference unclouded by sampling variability is not apparent when reviewing estimates of parameters like mean survival.
While others have recognized the inadequacy of some tag-recovery data for obtaining parameter estimates useful for informing waterfowl management (Sheaffer & Malecki, 1995), our finding about the impact of sample size on posterior inference is at least as important a result as the somewhat more expected influence of Bayesian priors. This is especially so because the sample sizes we found inadequate for detecting negative correlation have been thought more than adequate in recently published analyses despite previous cautions that these correlation analyses may have low power to detect negative correlation or additive mortality (Sedinger et al., 2010).
While power analyses or simulation studies should accompany complex empirical analyses, carefully assessing the support for multiple competing hypotheses applied to a posterior distribution (Wade, 2000) can also help avoid overconfident interpretation of results. In applications like ours, posterior distributions that span the parameter space or are bimodal (Appendix S2) simultaneously lend support to mutually exclusive ecological interpretations (Arnold, Afton, et al., 2017) (Table 4). For example, our results from modest data and Uniform priors could be interpreted as providing tentative support for moderate-to-strongly additive harvest, support for compensatory harvest (because the posterior distribution substantially overlaps 0), or inconclusive due to inadequate data (Appendix S2). We believe the last interpretation (inadequate data) would be most appropriate.
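
One simple way to assess competing hypotheses against a posterior is to report how much posterior mass falls in the region associated with each hypothesis. The sketch below assumes posterior draws of the correlation are available as a list of numbers; the function name, the region labels, and the cutoff of −0.5 are hypothetical choices for illustration, not thresholds used in this study.

```python
def hypothesis_support(draws, strong_cutoff=-0.5):
    """Share of posterior draws of a correlation falling in three regions.

    strong_cutoff is an arbitrary, hypothetical boundary between
    'strongly negative' and 'weakly negative' correlation.
    """
    n = len(draws)
    if n == 0:
        raise ValueError("at least one posterior draw is required")
    strongly_negative = sum(1 for d in draws if d < strong_cutoff)
    weakly_negative = sum(1 for d in draws if strong_cutoff <= d < 0.0)
    non_negative = sum(1 for d in draws if d >= 0.0)
    return {
        "strongly_negative": strongly_negative / n,
        "weakly_negative": weakly_negative / n,
        "non_negative": non_negative / n,
    }


# Hypothetical draws spanning most of (-1, 1); not output from a fitted model.
example_draws = [-0.9, -0.7, -0.6, -0.3, -0.1, 0.1, 0.2, 0.4, 0.5, 0.8]
support = hypothesis_support(example_draws)
print(support)
```

When no region holds a clear majority of the posterior mass, as in this toy example, the summary itself signals that the data do not discriminate among the interpretations.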
The challenges we document in estimating correlation with accuracy are partially attributable to the sparsity of tag-recovery data for species like mallards (or other waterfowl more generally). Our calculations of known-fate years revealed that only 0.1 and 0.2 years of known-fate data were obtained for every female and male mallard that was released with a band, respectively (Table 1). Given the relative sparsity of the female data, we focused our power analysis on female vital rates and sample sizes like those available for female mallards, northern pintail (Bartzen & Dufour, 2017), and lesser scaup (Arnold et al., 2016). Our example of 850 releases per year over 36 years provides a total sample of 28,800 adults (Table 2; AHY 850). Of these adults, an average of 70-75 were known to be alive during each year of the study, and our models relied on about 26 direct and about 33 indirect recoveries per year for parameter estimation (Table 2).
These summary statistics emphasize the modest nature of these data sets, which may lead to more cautious interpretations of parameters estimated from tag-recovery data sets. We note that the efforts of the USGS Bird Banding Lab to increase reporting probability in the 1990s approximately doubled band-reporting probability from an average of 0.3-0.4 to 0.7-0.8 (Arnold et al., 2020), so we expect statistical power per banded mallard to have doubled in years since 1996.
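
The link between reporting probability and the information gained per band can be illustrated with simple arithmetic. The sketch below is an assumption-laden illustration rather than part of our analysis: it treats the expected number of direct recoveries as releases × harvest probability × reporting probability, uses the midpoints of the reporting ranges given above (0.35 and 0.75), and uses a harvest probability of 0.09 that is a hypothetical placeholder, not an estimate from this study.

```python
def expected_direct_recoveries(releases, harvest_prob, reporting_prob):
    """Expected bands recovered and reported in the first year after release.

    Simplifying assumption: every harvested, retrieved band is reported
    with probability reporting_prob, independently of everything else.
    """
    for p in (harvest_prob, reporting_prob):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return releases * harvest_prob * reporting_prob


RELEASES_PER_YEAR = 850  # release scenario named in the text
HARVEST_PROB = 0.09      # hypothetical placeholder, not a study estimate

# Midpoints of the reporting-probability ranges quoted in the text.
before = expected_direct_recoveries(RELEASES_PER_YEAR, HARVEST_PROB, 0.35)
after = expected_direct_recoveries(RELEASES_PER_YEAR, HARVEST_PROB, 0.75)
ratio = after / before
print(before, after, ratio)
```

Because expected recoveries scale linearly with reporting probability in this simplification, the ratio depends only on the two reporting values and not on the placeholder harvest probability.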
Instead of recommending these prior distributions (or others) for future correlation analyses or trying to explain why a prior works better for one analysis than another, we emphasize that variable prior influence among different priors is an expected outcome of Bayesian estimation. Careful consideration of the hypotheses associated with a prior and simulation may be the only way to avoid incorrect inference in some applications. In the case of the standard deviation of a random effect (σ), trying to interpret this parameter may help explain why some priors are more useful than others. The hypothetical prior σ ∼ 0 (or an estimate of σ̂ = 0) could be interpreted as the observed data and truth being the same. Similarly, the hypothetical prior σ ∼ ∞ (or an estimate σ̂ = ∞) could be interpreted as the data being of such extreme variability that they cannot be used to approximate truth. If we implement models with priors closer to σ ∼ ∞ (the Wishart prior), then correlation estimates tending toward 0 are an expected outcome; two parameters estimated from a prior hypothesizing infinitely variable data should not be expected to be highly correlated. Further, prior influence should be expected to be greater when data are relatively sparse (Gelman et al., 2014) and even more so when working with presence-absence data instead of continuous data (Fay et al., 2021).
While there are more circumstances that impact Bayesian estimation beyond prior choice than we can consider here, two aspects of our results warrant brief mention. First, median correlation estimates from our power analysis were more negative for juveniles than adults (Table 4). At the same time, juvenile survival and recovery probabilities (Figure 3b) spanned a greater range of values than for adults (Figure 4b). The range of values occupied by estimated parameters relative to the uncertainty of these estimates should be an important consideration when assessing the feasibility of estimating nonzero correlations; as the true variation of two parameters increases relative to sampling variation, correlation between these parameters should be easier to detect. Second, our results indicate survival probabilities were sensitive to prior choice, while recovery probabilities were generally insensitive to prior choice (Figure 6). We predict that prior choice would be less important if, for example, we were focused on estimating correlation between juvenile and adult recovery. Additionally, we predict correlation between juvenile and adult recovery would be estimated with greater precision than correlation between juvenile and adult survival (Figure 6). Considerations like these highlight the need for simulation work that closely matches the circumstances of complex analyses of capture-recapture data (Fay et al., 2021; Riecke et al., 2019), including tag-recovery data.
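
The role of true variation relative to sampling variation can be sketched with a generic simulation. The example below is not a tag-recovery model: it assumes two correlated quantities with equal true standard deviations on an arbitrary scale, adds independent Gaussian sampling noise to each, and compares the resulting raw correlation with the classical attenuation formula. All function names and numeric settings are hypothetical.

```python
import math
import random


def expected_attenuation(rho, sd_true, sd_noise):
    """Classical attenuation: correlation of noisy values when both
    variables share the same true and noise standard deviations."""
    reliability = sd_true**2 / (sd_true**2 + sd_noise**2)
    return rho * reliability  # sqrt(reliability * reliability)


def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_observed_correlation(rho, sd_true, sd_noise, n=20000, seed=1):
    """Correlation between noisy observations of two correlated quantities."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x_true = sd_true * z1
        y_true = sd_true * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        # Independent sampling noise is added to each true value.
        xs.append(x_true + rng.gauss(0, sd_noise))
        ys.append(y_true + rng.gauss(0, sd_noise))
    return pearson(xs, ys)


# Same true correlation, two levels of sampling noise (hypothetical values).
low_noise = simulate_observed_correlation(-0.8, sd_true=0.5, sd_noise=0.1)
high_noise = simulate_observed_correlation(-0.8, sd_true=0.5, sd_noise=0.5)
print(low_noise, expected_attenuation(-0.8, 0.5, 0.1))
print(high_noise, expected_attenuation(-0.8, 0.5, 0.5))
```

Hierarchical models attempt to separate the two sources of variation rather than correlate raw estimates, so this sketch only illustrates why the separation becomes harder as sampling noise grows relative to true variation.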
Our power analysis indicates previous correlation analyses between survival and recovery (Arnold, Afton, et al., 2017; Arnold et al., 2016; Bartzen & Dufour, 2017) applied ecological interpretations to results that may not be statistically robust due to unrecognized data insufficiency (Table 5). In each case, these analyses are characterized by a variety of potential deficiencies that led us to suspect the power of these analyses should be reevaluated.
Future Bayesian analyses applying multivariate hierarchical models to tag-recovery data to assess the impact of harvest on population dynamics, including integrated population models with tag-recovery models in the joint likelihood (Arnold, Clark, et al., 2017; Koons et al., 2017), should include clearer descriptions of methods and prior distributions. Similar studies should also demonstrate through simulation or power analyses that the ecological questions being assessed can be answered with the data and statistical methods employed. We suspect some of the concerns identified in this manuscript are more broadly applicable to data types and hierarchical models beyond tag recoveries and the specific model type we evaluated.

ACKNOWLEDGMENTS
We acknowledge the groundbreaking work of C. Brownie. We thank two anonymous reviewers for invaluable and constructive feedback. This work has been supported with funding from the Vice Chancellor of Research at the University of Alaska and the Arctic Goose Joint Venture.

CONFLICTS OF INTEREST STATEMENT
No conflict of interest to declare.

DATA AVAILABILITY STATEMENT
The organized data (public data from USGS Bird Banding Lab) and R code files that support the results of this study are available at Dryad.
https://datadryad.org/stash/share/Vpl3oxfrDk_AOlkos32KnIuc0y-2Rk7oSu5ON xiA6Yw
TABLE 5 Possible paths to incorrect inference when negative correlation between survival and recovery is interpreted as support for additive harvest mortality but may also result from confounding between liberalized harvest opportunity and increased natural mortality when density-dependent regulation increases with population size (Sedinger & Herzog, 2012).